
Added multicolumn aggregation to DBA and improved three essential parts that suffer from many chunks #1069

Open
wants to merge 3 commits into base: dev
Conversation

@s3inlc (Member) commented Jun 20, 2024

This PR solves some of the issues with loading/handling on systems with a large number of tasks or a large number of agents:

  • When creating chunks, there was previously a single global lock file that was acquired no matter which task a chunk was about to be created for. Locking per task is sufficient, as creating chunks for two different tasks at the same time poses no problem (see the sketch after this list).
  • When active tasks have a very large number of chunks, chunk assignment on the server can take so long that the agent assumes the connection has timed out (30 s). All agents can then end up in a loop of re-requesting chunks/tasks, while the server can no longer respond fast enough, because checking an active task meant looping over all of its chunks, which obviously takes longer the more chunks there are. The changes in TaskUtils solve this by looping only over the non-completed chunks and using SUM in SQL to determine the other needed values.
  • In general there are many places in the code where multiple queries sum/count/max over columns (or, even worse in most of these places, the entries are loaded and looped over in code). With many chunks, the biggest issue was the getTaskInfo() function used by the current old UI, which caused very long loading times on the tasks page whenever tasks with many chunks were listed. To tackle this, the DBA was extended slightly so that a single query can sum/count/max/... over multiple columns (as long as the WHERE conditions are the same for all of them). For getTaskInfo() specifically, this reduced loading times by roughly 5-6x.
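
As a rough illustration of the per-task locking idea (a minimal sketch, not the actual chunk creation code; the lock directory and file naming are assumptions):

// Sketch: derive the lock file name from the task ID so that chunk creation
// for different tasks no longer serializes on a single global lock.
function withTaskLock(int $taskId, callable $createChunk) {
    $lockFile = sys_get_temp_dir() . "/chunk-task-" . $taskId . ".lock"; // hypothetical path
    $handle = fopen($lockFile, "c");
    if ($handle === false) {
        throw new RuntimeException("Cannot open lock file for task " . $taskId);
    }
    try {
        flock($handle, LOCK_EX); // blocks only other chunk creators for the SAME task
        return $createChunk();   // safe to create the chunk now
    } finally {
        flock($handle, LOCK_UN);
        fclose($handle);
    }
}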

The DBA function change has also been applied to the getChunkInfo() function, but there are likely still many places where using an aggregation function alone, or the newly added multicolumn aggregation, could give additional speedups.
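
As a rough illustration (assembled from the snippets quoted in the review comments below; the filters $qF1/$qF2 are assumed to be set up by the caller), the new multicolumn aggregation returns several aggregated columns from one query:

// Sketch: sum two Chunk columns in a single query instead of issuing two queries.
$agg1 = new Aggregation(Chunk::SOLVE_TIME, "SUM");
$agg2 = new Aggregation(Chunk::DISPATCH_TIME, "SUM");
$results = Factory::getChunkFactory()->multicolAggregationFilter(
    [Factory::FILTER => [$qF1, $qF2]],
    [$agg1, $agg2]
);
$totalTimeSpent = $results[$agg1->getName()] - $results[$agg2->getName()];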

zyronix and others added 3 commits April 22, 2024 10:48
Adding loops to scan through lines to support importing hashes longer than 1024 bytes
@s3inlc requested a review from zyronix June 20, 2024 12:08
@s3inlc changed the base branch from master to dev September 27, 2024 13:46
@s3inlc requested review from jessevz and removed request for zyronix September 27, 2024 13:47

$totalTimeSpent = $results[$agg1->getName()] - $results[$agg2->getName()];

$agg1 = new Aggregation(Chunk::CHECKPOINT, "SUM");
Contributor

Instead of using magic strings like "MAX", "SUM" and "COUNT", it would probably be nicer to define constants inside the Aggregation class and use those.
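
A minimal sketch of what that could look like (the constant names are assumptions, not the actual Aggregation implementation):

class Aggregation {
    // Hypothetical constants replacing the "magic string" function names.
    const SUM   = "SUM";
    const COUNT = "COUNT";
    const MAX   = "MAX";
    const MIN   = "MIN";
    // ... existing constructor and getName() stay unchanged
}

// Call sites would then read:
$agg1 = new Aggregation(Chunk::CHECKPOINT, Aggregation::SUM);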

$qF3 = new QueryFilter(Chunk::SOLVE_TIME, 0, ">");
$agg1 = new Aggregation(Chunk::SOLVE_TIME, "SUM");
$agg2 = new Aggregation(Chunk::DISPATCH_TIME, "SUM");
$results = Factory::getChunkFactory()->multicolAggregationFilter([Factory::FILTER => [$qF1, $qF2, $qF3]], [$agg1, $agg2]);
Contributor

Isn't this query redundant? Couldn't all of this information be obtained from the query on line 399 by just adding agg1 and agg2 to it? I understand that the query here also filters for dispatch time and solve time greater than 0, but since you aggregate them with SUM anyway, that doesn't really matter: values of 0 have no effect on the sum. Getting all values out of the query on line 399 would reduce the load on the database.
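
In other words (a sketch under the assumption that the query on line 399 uses the same $qF1/$qF2 filters), the two sums could simply be appended to the existing multicolumn call and the >0 filters dropped, since zero-valued rows do not change a SUM:

// Hypothetical merged call serving both purposes with one query.
$aggs = [
    // ... whatever aggregations the query on line 399 already requests, plus:
    new Aggregation(Chunk::SOLVE_TIME, "SUM"),
    new Aggregation(Chunk::DISPATCH_TIME, "SUM"),
];
$results = Factory::getChunkFactory()->multicolAggregationFilter([Factory::FILTER => [$qF1, $qF2]], $aggs);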

@jessevz self-assigned this Oct 14, 2024
Labels: None yet
Projects: Status: 🧐 Review
3 participants